Summary:

The authors present an architecture that focuses on reducing costly memory accesses and comparison operations when implementing Q-Learning accelerators on modern FPGAs. They also propose multiple variations of the architecture that provide varying power-performance trade-offs.

Pros:

1. The paper includes many comparison tables and figures to present its results. The algorithm is well explained.

Cons:

1. The paper lacks novelty. It only improves on the previous Q-Learning algorithms [5]. I am not sure that this is enough for publication at this international conference.
2. In Algorithm 1, the while loop never terminates because k is not incremented inside the loop. The authors cite the reference from which the algorithm is taken, but the values of the parameters 𝑅, 𝛼, 𝛾 and 𝜖 are not specified in the algorithm. Since the authors improve on the algorithm, the values of these parameters should be specified.
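
To make the second point concrete, here is a minimal sketch of generic tabular Q-learning in which the loop counter k is incremented on every pass and the parameters are stated explicitly. This is not the authors' Algorithm 1. The function name, the `step` and `reward` callbacks, the `max_iters` bound, and the default values of 𝛼, 𝛾 and 𝜖 are all assumptions chosen for illustration only.

```python
import random


def q_learning(n_states, n_actions, step, reward,
               alpha=0.1, gamma=0.9, epsilon=0.1,
               max_iters=1000, seed=0):
    """Generic tabular Q-learning sketch (all defaults are illustrative)."""
    rng = random.Random(seed)
    # Q-table initialised to zero: one row per state, one column per action.
    q = [[0.0] * n_actions for _ in range(n_states)]
    state = 0
    k = 0
    while k < max_iters:
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: q[state][a])

        next_state = step(state, action)
        r = reward(state, action)

        # Standard Q-learning update toward the bootstrapped target.
        target = r + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])

        state = next_state
        k += 1  # without this increment the loop would never terminate
    return q, k
```

With the counter updated inside the loop and the parameter values given alongside the algorithm, a reader can check termination and reproduce the results directly.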

Score: 2 out of 5